DETAILED ACTION
Claim Rejections - 35 USC § 102 

1. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States 
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 

2. Claims 1-2, 4, 19-20, 22, and 31 are rejected under 35 U.S.C. 102(e) as being 
anticipated by Garudadri et al. (US Patent No. 6671669). 

3. Regarding claim 1, Garudadri et al. disclose a method for transcribing speech of
a plurality of speakers, comprising: providing said speech to a plurality of speech
decoders, each of said decoders using a speaker model for one of said speakers and
generating a confidence score for each decoded output (figure 1 or referring to col. 8,
ln. 33-67); and selecting a decoded output based on said confidence score (the process
of figures 7-9).
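
For illustration only, the limitation mapped above (several decoders, each tied to one speaker model and each emitting a confidence score, followed by selection on that score) can be sketched as below. The sketch is not taken from Garudadri et al. or from the application; the names DecodedOutput and select_output, the speaker labels, and the sample scores are hypothetical, and selecting the maximum score is an assumption matching the "highest confidence score" wording of claims 19 and 31.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DecodedOutput:
    speaker_id: str    # speaker model the decoder used (hypothetical label)
    text: str          # hypothesis produced by that decoder
    confidence: float  # decoder's confidence score, higher is better


def select_output(outputs: Sequence[DecodedOutput]) -> DecodedOutput:
    """Return the decoded output with the highest confidence score."""
    if not outputs:
        raise ValueError("at least one decoded output is required")
    return max(outputs, key=lambda o: o.confidence)


# Two hypothetical decoders processed the same speech segment.
outputs = [
    DecodedOutput("speaker_a", "recognize speech", 0.91),
    DecodedOutput("speaker_b", "wreck a nice beach", 0.42),
]
best = select_output(outputs)
```

Here best is the output decoded with the speaker_a model, since its score is the larger of the two.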

4. Regarding claim 19, Garudadri et al. disclose a system for transcribing speech of 
a plurality of speakers, comprising: 

a memory that stores computer-readable code (col. 13, ln. 18-60); and a
processor operatively coupled to said memory, said processor configured to implement
said computer-readable code (col. 13, ln. 18-60), said computer-readable code
configured to:

provide said speech to a plurality of speech decoders, each of said decoders
using a speaker model for one of said speakers and generating a confidence score for
each decoded output (figure 1 or referring to col. 8, ln. 33-67); and select a decoded
output having a highest confidence score (the process of figures 7-9).

5. Regarding claim 31, Garudadri et al. disclose an article of manufacture for
transcribing speech of a plurality of speakers, comprising: 

a computer readable medium having computer readable code means embodied
thereon (col. 13, ln. 18-60), said computer readable program code means comprising:

a step to provide said speech to a plurality of speech decoders, each of said
decoders using a speaker model for one of said speakers and generating a
confidence score for each decoded output (figure 1 or referring to col. 8, ln. 33-67);
and a step to select a decoded output having a highest confidence score (the process
of figures 7-9).

6. Regarding claims 2, 4, 20, and 22, Garudadri et al. further disclose the method
and system of claims 1 and 19, further comprising the step of aligning each of said
decoded outputs in time (the DTW Matching in elements 118, 120, and 122 in figure 1 is
the aligning step), and the step of presenting said selected decoded output to a user
(col. 8, ln. 64 to col. 9, ln. 2).
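
The rejection above maps DTW (dynamic time warping) matching onto the claimed aligning step. For readers unfamiliar with the technique, a textbook form of the DTW cost computation is sketched below. It is a generic illustration over one-dimensional sequences, not the implementation of elements 118, 120, and 122 of Garudadri et al.; the function name dtw_distance and the absolute-difference local cost are assumptions.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The second sequence is the first with one sample stretched in time.
stretched = dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
shifted = dtw_distance([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

A time-stretched copy aligns at zero cost, which is the property that makes DTW usable for aligning outputs produced at different rates.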

7. Claims 11-18, 23-30, and 32-33 are rejected under 35 U.S.C. 102(e) as being
anticipated by Murveit et al. (US Patent No. 6671669).

8. Regarding claim 11, Murveit et al. disclose a method for transcribing speech of a
plurality of speakers, comprising: providing said speech to a speaker independent
speech recognition system and a speaker specific speech recognition system (referring
to figures 3-5 or col. 4, ln. 1 to col. 5, ln. 67); and decoding said speech using said
speaker independent speech recognition system whenever the identity of the current
speaker is unknown (elements 206, 208, and 210 in figure 3).

9. Regarding claim 15, Murveit et al. disclose a method for transcribing speech of a
plurality of speakers, comprising: providing said speech to a speaker independent
speech recognition system and a speaker specific speech recognition system (referring
to figures 3-5 or col. 4, ln. 1 to col. 5, ln. 67); and decoding said speech using said
speaker specific speech recognition system with a speaker model for an identified
speaker until there is a speaker change (in the operation of figure 3, at first, when the
speaker using the system is speaker specific, the acoustic profile of that speaker is used
in the speech recognition process. However, when a different speaker, detected by the
ID Speaker 204, uses the system, the system retrieves the profile of said different
speaker for use in the speech recognition process).
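
As a reading aid, the behavior relied on in paragraphs 8 and 9 (speaker independent decoding while the speaker is unknown, a speaker specific profile once the speaker is identified, and a profile switch on a speaker change) can be sketched as a simple routing function. This is a hypothetical sketch, not code from Murveit et al.; choose_models, the segment labels, and the profile names are assumed for illustration, and the speaker identification step is represented only by its result.

```python
from typing import Dict, Iterable, List, Optional, Tuple

SPEAKER_INDEPENDENT = "speaker-independent"


def choose_models(
    segments: Iterable[Tuple[str, Optional[str]]],
    profiles: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Pair each speech segment with the model that would decode it.

    Each segment is (audio_label, speaker_id); speaker_id is None when the
    speaker identification step could not name the speaker.
    """
    plan = []
    for audio, speaker_id in segments:
        if speaker_id is not None and speaker_id in profiles:
            model = profiles[speaker_id]  # speaker specific profile
        else:
            model = SPEAKER_INDEPENDENT   # identity unknown: fall back
        plan.append((audio, model))
    return plan


segments = [("seg1", None), ("seg2", "alice"), ("seg3", "alice"), ("seg4", "bob")]
profiles = {"alice": "alice-profile", "bob": "bob-profile"}
plan = choose_models(segments, profiles)
```

In this example the first segment is routed to the speaker independent model, the next two to the alice profile, and the last to the bob profile after the speaker change.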

10. Regarding claim 23, Murveit et al. further disclose a system for transcribing
speech of a plurality of speakers, comprising:

a memory that stores computer-readable code (elements 104 and 106 in figure 2);
and a processor operatively coupled to said memory, said processor configured to
implement said computer-readable code (Processor 102 in figure 2), said
computer-readable code configured to:

provide said speech to a speaker independent speech recognition system and a
speaker specific speech recognition system (referring to figures 3-5 or col. 4, ln. 1 to
col. 5, ln. 67); and decode said speech using said speaker independent speech
recognition system whenever the identity of the current speaker is unknown (elements
206, 208, and 210 in figure 3).

11. Regarding claim 27, Murveit et al. disclose a system for transcribing speech of a
plurality of speakers, comprising: 

a memory that stores computer-readable code (elements 104 and 106 in figure 2);
and a processor operatively coupled to said memory, said processor configured to
implement said computer-readable code (Processor 102 in figure 2), said
computer-readable code configured to:

provide said speech to a speaker independent speech recognition system and a
speaker specific speech recognition system (referring to figures 3-5 or col. 4, ln. 1 to
col. 5, ln. 67); and decode said speech using said speaker specific speech recognition
system with a speaker model for an identified speaker until there is a speaker change
(in the operation of figure 3, at first, when the speaker using the system is speaker
specific, the acoustic profile of that speaker is used in the speech recognition process.
However, when a different speaker, detected by the ID Speaker 204, uses the system,
the system retrieves the profile of said different speaker for use in the speech
recognition process).

12. Regarding claims 12-14, 17-18, 24-26, and 29-30, Murveit et al. further disclose 
the method and system of claims 11, 15, 23, and 27, wherein said decoding step 
continues until a speaker identification system identifies an unknown speaker (the 
operation in figure 3 is a continuous process, speaker changes are continuously 
monitored by the ID Speaker 204), wherein one or more of said speaker independent 
speech recognition system and said speaker specific speech recognition system are on 
a remote server (Speech Recognition System in figure 2), and the step of presenting
said selected decoded output to a user (col. 7, ln. 9-22).

13. Regarding claims 16 and 28, Murveit et al. further disclose the method and 
system of claims 15 and 27, further comprising the step of decoding said speech using 
a speaker independent speech recognition system until the identity of a speaker is 
determined and the appropriate speaker model is loaded (in the operation of figure 3, at
first, when the speaker using the system is speaker specific, the acoustic profile of that
speaker is used in the speech recognition process. However, when a different speaker,
detected by the ID Speaker 204, uses the system, the system retrieves the profile of
said different speaker for use in the speech recognition process).

14. Regarding claim 32, Murveit et al. disclose an article of manufacture for
transcribing speech of a plurality of speakers, comprising:

a computer readable medium having computer readable code means embodied 
thereon (memory system 104 in figure 4), said computer readable program code means 
comprising: 

a step to provide said speech to a speaker independent speech recognition
system and a speaker specific speech recognition system (referring to figures 3-5 or
col. 4, ln. 1 to col. 5, ln. 67); and

a step to decode said speech using said speaker independent speech 
recognition system whenever the identity of the current speaker is unknown (elements 
206, 208, and 210 in figure 3). 

15. Regarding claim 33, Murveit et al. disclose an article of manufacture for 
transcribing speech of a plurality of speakers, comprising: 

a computer readable medium having computer readable code means embodied 
thereon (memory system 104 in figure 4), said computer readable program code means 
comprising: 

a step to provide said speech to a speaker independent speech recognition
system and a speaker specific speech recognition system (referring to figures 3-5 or
col. 4, ln. 1 to col. 5, ln. 67); and

a step to decode said speech using said speaker specific speech recognition
system with a speaker model for an identified speaker until there is a speaker change
(in the operation of figure 3, at first, when the speaker using the system is speaker
specific, the acoustic profile of that speaker is used in the speech recognition process.
However, when a different speaker, detected by the ID Speaker 204, uses the system,
the system retrieves the profile of said different speaker for use in the speech
recognition process).

Claim Rejections - 35 USC § 103 

16. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

17. Claims 5-7 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Garudadri et al. (US Patent No. 6671669) in view of Baker (US Patent No. 6122613). 

18. Regarding claims 5-7, Garudadri et al. fail to specifically disclose the method of
claim 1, further comprising the step of manually selecting an alternate decoded output if
said assigned output is incorrect, the step of adapting said selecting step based on said
manual selection, and the step of presenting several decoded outputs to a user with an
indication of said corresponding confidence score.

However, Baker teaches the step of manually selecting an alternate decoded
output if said assigned output is incorrect (col. 8, ln. 33 to col. 9, ln. 40), the step of
adapting said selecting step based on said manual selection (col. 9, ln. 18-40), and the
step of presenting several decoded outputs to a user with an indication of said
corresponding confidence score (col. 8, ln. 33 to col. 9, ln. 40).

Since Garudadri et al. and Baker are analogous art because they are from the
same field of endeavor, it would have been obvious to one of ordinary skill in the art at
the time of invention to modify Garudadri et al. by incorporating the teaching of Baker in
order to enable the user to correct misrecognized words and to train the system to
enhance subsequent recognitions.

19. Claims 3, 10, and 21 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Garudadri et al. (US Patent No. 6671669) in view of Murveit et al. (US Patent No. 
6671669). 

20. Regarding claim 10, Garudadri et al. fail to disclose the method of claim 1,
wherein said selecting step further comprises the step of determining if a decoded
output includes an isolated word from a second speaker in a string of words from a first
speaker. However, Murveit et al. teach that the selecting step further comprises the
step of determining if a decoded output includes an isolated word from a second
speaker in a string of words from a first speaker (Speaker ID 204 in figure 3 determines
speaker changes).

Since Garudadri et al. and Murveit et al. are analogous art because they are from
the same field of endeavor, it would have been obvious to one of ordinary skill in the
art at the time of invention to modify Garudadri et al. by incorporating the teaching of
Murveit et al. in order to have the system use a speaker-specific profile in recognizing
speech, thereby enhancing speech recognition accuracy.

21. Regarding claims 3 and 21, Garudadri et al. fail to specifically disclose the
method and system of claims 1 and 19, wherein one or more of said speech decoders
are on a remote server. However, Murveit et al. teach that one or more of said speech
decoders are on a remote server (Speech Recognition System 100 in figure 2).

Since Garudadri et al. and Murveit et al. are analogous art because they are from
the same field of endeavor, it would have been obvious to one of ordinary skill in the
art at the time of invention to modify Garudadri et al. by incorporating the teaching of
Murveit et al. in order to distribute the processing load from the
limited-processing-capability device.

22. Claim 8 is rejected under 35 U.S.C. 103(a) as being unpatentable over Garudadri 
et al. (US Patent No. 6671669) in view of Chao Chang et al. (US Patent No. 6567778). 

23. Regarding claim 8, Garudadri et al. fail to specifically disclose the method of
claim 1, further comprising the step of presenting said decoded output as a string of
words if said corresponding confidence score exceeds a certain threshold and as a
string of phones if said corresponding confidence score is below a certain threshold.
However, Chao Chang et al. teach the step of presenting said decoded output as a
string of words if said corresponding confidence score exceeds a certain threshold and
as a string of phones if said corresponding confidence score is below a certain
threshold (element 116 in figure 1: if the confidence score is lower than the threshold,
the user is queried to re-input the low confidence score words. Thus, the query must
represent a series of phones).
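
The limitation of claim 8 recited above amounts to a threshold test on the confidence score. A minimal sketch follows, for illustration only; it is not drawn from Chao Chang et al. or from the application. The function name present, the phone symbols, and the threshold value are hypothetical, and treating a score exactly equal to the threshold as "below" is an assumption, since the claim language does not address that case.

```python
from typing import List


def present(words: List[str], phones: List[str],
            confidence: float, threshold: float) -> str:
    """Show words when the score exceeds the threshold, phones otherwise."""
    if confidence > threshold:
        return " ".join(words)
    # Assumption: a score equal to the threshold is handled like a low score.
    return " ".join(phones)


words = ["hello"]
phones = ["HH", "AH", "L", "OW"]  # hypothetical phone string for the word
high = present(words, phones, confidence=0.9, threshold=0.5)
low = present(words, phones, confidence=0.2, threshold=0.5)
```

With these inputs, high is the word string and low is the phone string.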

Since Garudadri et al. and Chao Chang et al. are analogous art because they are
from the same field of endeavor, it would have been obvious to one of ordinary skill in
the art at the time of invention to modify Garudadri et al. by incorporating the teaching of
Chao Chang et al. in order to allow the user to correct misrecognized words to enhance
speech recognition in subsequent recognitions.

Allowable Subject Matter 

24. Claim 9 is objected to as being dependent upon a rejected base claim, but would 
be allowable if rewritten in independent form including all of the limitations of the base 
claim and any intervening claims. 

The following is a statement of reasons for the indication of allowable subject
matter: Garudadri et al. disclose a method and system that combines speech
recognition engines and resolves any differences between the results of individual
speech recognition engines (referring to figure 1). Murveit et al. teach a speech
recognition adaptation system in which, whenever a new speaker is encountered, the
system uses the speaker independent models to carry out speech recognition. At the
same time the speaker independent models are used to adapt the speech of the new
speaker. The adapted speech for the speaker is stored in memory for use in subsequent
speech recognition of this speaker (referring to figures 3-5). However, both Garudadri
et al. and Murveit et al. fail to specifically disclose the step of "presenting said decoded
output as a string of words for the decoded output having the highest confidence score
and as phones or syllables for all other decoded outputs." Furthermore, it would not
have been obvious to one of ordinary skill in the art at the time of invention to modify
the prior art of record to realize the claimed invention.

Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Huyen Vo whose telephone number is 703-305-8665.
The examiner can normally be reached on M-F, 9-5:30. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Doris To, can be reached on 703-305-4827. The fax phone number for the
organization where this application or proceeding is assigned is 703-872-9306.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for
published applications may be obtained from either Private PAIR or Public PAIR.
Status information for unpublished applications is available through Private PAIR only.
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should
you have questions on access to the Private PAIR system, contact the Electronic
Business Center (EBC) at 866-217-9197 (toll-free).

Examiner Huyen X. Vo November 18, 2004 
PRIMARY EXAMINER




